Abstract

We develop a new statistical method to reanalyse angular correlations between background 
QSOs and foreground galaxies that are supposed to be a consequence of dark matter inhomo- 
geneities acting as weak gravitational lenses. The method is based on a weighted average over 
the galaxy positions and is optimized to distinguish between a random distribution of galaxies 
around QSOs and a distribution which follows an assumed QSO-galaxy two-point correlation 
function, by choosing an appropriate weight function. With simulations we demonstrate that 
this weighted average is slightly more significant than Spearman's rank-order test which was 
used in previous investigations. In particular, the advantages of the weighted average show 
up if the two-point correlation function is weak. 

We then reanalyze the correlation between high-redshift 1-Jansky QSOs and IRAS galax- 
ies, taken from the IRAS Faint Source Catalog; these samples were analyzed previously using 
Spearman's rank-order test. In agreement with the previous work, we find moderate to strong 
correlations between these two samples; considering the angular two-point correlation func- 
tion of these samples, we find a typical scale of order 5' from which most of the correlation 
signal derives. However, the statistical significance of the correlation changes with the redshift 
slices of the QSO sample one considers. Comparing with simple theoretical estimates of the 
expected correlation, we find that the signal we derive is considerably stronger than expected. 
On the other hand, recent direct verifications of the overdensity of matter in the line-of-sight
to high-redshift radio QSOs, obtained from the shear field around these sources, indicate that
the observed association can be attributed to a gravitational lens effect.



1 Introduction 



It was argued by Bartelmann & Schneider (1991, 1992, 1993a, 1993b) that a statistical
association between foreground galaxies and distant, radio-loud background sources, as claimed to be
observed by Fugmann (1988, 1990), can be caused by gravitational lensing effects due to
large-scale structures of the dark-matter distribution in the Universe. Using Spearman's rank-order test
to investigate the association of 1-Jy sources with Lick galaxies and IRAS galaxies, they indeed
found correlations at a high level of significance (Bartelmann & Schneider 1993b, 1994, hereafter
BS). A quantification of such correlations could prove to be a unique tool for directly probing 
dark-matter inhomogeneities. It is therefore very important to verify the results of these analyses 
with independent methods and, if possible, to improve their statistical significance, or to obtain 
more detailed information about the association, e.g. the amplitude of the correlation function or 
a characteristic angular scale. Other groups have shown a statistically significant association
between other samples of QSOs and foreground matter: Rodrigues-Williams & Hogan (1994), Seitz
& Schneider (1995) and Wu & Han (1995) have shown evidence for an overdensity of Abell and
Zwicky clusters around high-redshift QSOs, and Hutchings (1995) has studied the distribution of
galaxies around seven QSOs at z = 2.3, and found a statistically significant excess around all of
them. Although he did not interpret this result as being due to gravitational lensing, lensing appears
to be a more natural explanation than assuming that these galaxies are spatially associated with
the QSOs, which would imply an enormous luminosity evolution.

Direct evidence for the presence of lensing matter in the line-of-sight towards high-redshift 
QSOs was obtained by Fort et al. (1995) who imaged faint galaxies around several high-redshift 
1-Jy QSOs. For several of them, they obtained clear evidence for a coherent shear pattern around
these QSOs, which can be attributed to local concentrations of faint galaxies. These concentrations 
may indicate the presence of a group or a cluster, but they are so faint optically that they would 
not appear in any cluster catalog. What this might suggest is that there exists a population of 
clusters with a much larger mass-to-light ratio than those clusters which are selected because of 
their high optical luminosity, i.e., which appear in optically-selected cluster catalogs. If these 
findings are confirmed (e.g., by HST observations), one has found a way to obtain a mass-selected 
sample of clusters and/or groups. 

In the following sections, we introduce a new approach to test for correlations (Sect. 2), and use
numerical simulations (Sect. 3) to demonstrate that it is applicable and can lead to an improvement
of the significance of the results, if compared to Spearman's rank-order test. Section 4 then presents
the results that are obtained with our method for 1-Jy quasars and IRAS galaxies. In addition, we
perform further simulations to compare our findings with what is expected from theory. Finally
we summarize and present our conclusions in the last section.



2 Method 

In this section we define our new correlation test, based on a weighted average. Furthermore we 
show how it can be applied for measuring quasar-galaxy associations and how to make use of an 
a priori guess (e.g. from theory) of the quasar-galaxy correlation function to find an optimum 
weight function. 

For readers who lack the patience to follow the arguments below, we now give a brief summary
of the results of this section, which should enable them to proceed directly to Sect. 2.3.

Given a sample of QSOs and a sample of $N$ galaxies around these QSOs, such that $\phi_i$ is the
angular separation of the $i$-th galaxy from its associated QSO, we define a correlation coefficient
$r_g$ by

$$ r_g := \frac{1}{N} \sum_{i=1}^{N} g(\phi_i) , $$

where $g(\phi)$ is a weight function. Given an assumed two-point correlation function $\xi_{qg}(\phi)$ between
QSOs and galaxies, we show that the optimal choice of the weight function to allow the distinction
between the assumed two-point correlation function and a random distribution of galaxies relative
to the QSOs is given by

$$ g(\phi) = a\,\xi_{qg}(\phi) + b , \tag{1} $$

with arbitrary constants $a \neq 0$ and $b$.

2.1 Definitions 

For any realisation $(\vec x, \vec y)$ of a pair $(\vec X, \vec Y)$ of $n$-dimensional random variables
$\vec X := (X_1, \dots, X_n)$ and $\vec Y := (Y_1, \dots, Y_n)$ we define a correlation coefficient by

$$ r(\vec x, \vec y) := \sum_{i=1}^{n} g(x_i) \cdot f(y_i) \tag{2} $$

with arbitrary functions $g$ and $f$. Formally, this can be understood as an average of $g(x_i)$ weighted
with $f(y_i)$ (or vice versa), but without the usual normalisation of the weights. In our application
below, the vector $\vec Y$ denotes the counts of galaxies in $n$ concentric rings around quasars, and the
vector $\vec X$ denotes the radii of these rings.
Given the probability density $p_0(\vec x, \vec y)$ of $(\vec X, \vec Y)$ and one single realisation
$(\vec x', \vec y')$ of two random variables, it is our aim to decide whether the assumption of
$(\vec x', \vec y')$ being a realisation of $(\vec X, \vec Y)$ can be rejected or not. For this purpose we use
$p_0(\vec x, \vec y)$ to determine the distribution of the correlation coefficient for realisations of
$(\vec X, \vec Y)$, i.e. we calculate the cumulative probability

$$ P_0(r \ge R) = \int \cdots \int d^n x \, d^n y \; p_0(\vec x, \vec y) \, \Theta\big(r(\vec x, \vec y) - R\big) $$

for $r$ taking a value greater than or equal to some threshold $R$ for any realisation of $(\vec X, \vec Y)$;
here $\Theta$ denotes the Heaviside step function.

Now suppose we find a value $R' := r(\vec x', \vec y')$ of the correlation coefficient of $(\vec x', \vec y')$, and
let us define $\varepsilon$ as

$$ \varepsilon := P_0(r \ge R') . $$

Premising $(\vec x', \vec y')$ to be a realisation of $(\vec X, \vec Y)$, we know that $\varepsilon$ is the probability of
the correlation coefficient to give a result greater than or equal to $R'$. Hence, if $\varepsilon$ happens to be
very small (or very large, because then $1 - \varepsilon = P_0(r < R')$ is very small) we might reject our premise
and rather assume $(\vec x', \vec y')$ to be drawn from a different pair of random variables $(\vec X', \vec Y')$.

As a consequence of this strategy, we would erroneously conclude for a fraction $\varepsilon$ of all
realisations of $(\vec X, \vec Y)$ that they are not drawn from $(\vec X, \vec Y)$, because their correlation
coefficient is greater than or equal to $R'$. Throughout this paper the value of $\varepsilon$ will therefore be
called the 'error level'.
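When the distribution of the correlation coefficient under the null hypothesis is only available through simulated realisations, the error level can be estimated as the fraction of those realisations that reach or exceed the observed value. The following is a minimal sketch under that assumption; the function name `error_level` and the toy numbers are illustrative and not taken from the paper.

```python
from bisect import bisect_left


def error_level(r_observed, r_null_samples):
    """Estimate P_0(r >= r_observed) from simulated null-hypothesis values."""
    ordered = sorted(r_null_samples)
    # bisect_left gives the number of samples strictly below r_observed
    n_below = bisect_left(ordered, r_observed)
    return (len(ordered) - n_below) / len(ordered)


# Toy null sample (made-up values, for illustration only)
null_r = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
eps = error_level(0.8, null_r)  # three of the ten values are >= 0.8
```

The resolution of such an estimate is limited to one over the number of simulated realisations, so very small error levels require correspondingly large null samples.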

Up to now, we did not specify the functions $g$ and $f$. Intuitively, they should be adapted to
the given problem. Imagine we want to check whether $(\vec x', \vec y')$ can be considered a realisation
of $(\vec X, \vec Y)$. Furthermore, we suspect that $(\vec x', \vec y')$ has been drawn from different random
variables $(\vec X', \vec Y')$ with a corresponding probability density $p_a(\vec x, \vec y)$, where
the subscript 'a' stands for the 'alternative hypothesis'. What we want to achieve then is that
the correlation test allows for an optimum distinction between these two hypotheses. For both of
them we can, in principle, derive the probability density of the correlation coefficient $r$ from the
equations

$$ p_0(r) = \int \cdots \int d^n x \, d^n y \; p_0(\vec x, \vec y) \, \delta\big(r(\vec x, \vec y) - r\big) , $$
$$ p_a(r) = \int \cdots \int d^n x \, d^n y \; p_a(\vec x, \vec y) \, \delta\big(r(\vec x, \vec y) - r\big) , $$

where $\delta$ denotes Dirac's delta function. The first definition one can think of to quantify the
distinction between $p_0(r)$ and $p_a(r)$ is the mean error level

$$ \langle P_0(r \ge R) \rangle := \int P_0(r \ge R) \, p_a(R) \, dR . \tag{3} $$

According to the preceding explanations the mean error level should be as small (or large) as
possible.

If $p_0(r)$ and $p_a(r)$ are characterised by a single (e.g. Gaussian-like) peak, we can also expect
the quantities

$$ Q_1 := \frac{\langle r \rangle_a - \langle r \rangle_0}{\sigma_0} , \tag{4} $$

$$ Q_2 := \frac{\langle r \rangle_a - \langle r \rangle_0}{\sigma_a} , \tag{5} $$

together with the definitions

$$ \langle r \rangle_{a,0} := \int r \, p_{a,0}(r) \, dr , \qquad \sigma_{a,0}^2 := \int \big(r - \langle r \rangle_{a,0}\big)^2 \, p_{a,0}(r) \, dr , $$

to be good measures of the distinction between $p_0(r)$ and $p_a(r)$. Figure 1 illustrates this concept.

Figure 1: A simple measure of the distinction of two Gaussian-like distributions is the separation
of their mean values in units of either one of their standard deviations. The shaded area represents
the error level $P_0(r \ge \langle r \rangle_a)$.

In the following section we want to specify the functions $g$ and $f$. We claim that by maximizing
either $|Q_1|$ or $|Q_2|$ we can find $g$ and $f$ such that $r$ as a test for quasar-galaxy correlations operates
close to its optimum sensitivity, which we will demonstrate in Sect. 3 by means of numerical
simulations.
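For two simulated samples of the correlation coefficient, the separation measures $Q_1$ and $Q_2$ can be estimated from sample means and standard deviations. The sketch below assumes that each distribution is well described by its first two moments; the helper name `separation_measures`, the sign convention (alternative minus null) and the toy values are illustrative assumptions.

```python
from statistics import mean, pstdev


def separation_measures(r_null, r_alt):
    """Return (Q1, Q2): separation of the means in units of sigma_0 and sigma_a."""
    delta = mean(r_alt) - mean(r_null)
    return delta / pstdev(r_null), delta / pstdev(r_alt)


# Toy samples (made-up values, not from the paper)
r_null = [0.0, 1.0, 2.0]
r_alt = [3.0, 5.0, 7.0]
q1, q2 = separation_measures(r_null, r_alt)
```

Only the absolute values of the two measures matter for the optimisation, so the sign convention chosen here has no influence on the result.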




2.2 Quasar-galaxy correlations 

We now want to apply these theoretical concepts to the associations between quasars and galaxies 
by analysing the radial distribution of galaxies in the vicinity of quasars from a given sample. 
Suppose that for each quasar we have a list of relative angular coordinates of galaxies within a 
certain (preferably circular) field around the quasar. 

One way to investigate the galaxy distribution is to merge these lists into one total list
corresponding to a total galaxy field. For the correlation test we choose an inner and an outer angular
radius, $\phi_{\rm in}$ and $\phi_{\rm out}$, to define a ring $[\phi_{\rm in}, \phi_{\rm out}]$ within the total field which is divided
into $n$ concentric annuli $[\varrho_i, \varrho_{i+1}]$ by the $n + 1$ radii

$$ \varrho_i := \phi_{\rm in} + (i - 1) \cdot \frac{\phi_{\rm out} - \phi_{\rm in}}{n} , \quad i = 1, \dots, n + 1 . \tag{6} $$

Let $z_i$ denote the number of galaxies within $[\varrho_i, \varrho_{i+1}]$ and set $\vec x := (\varrho_1, \dots, \varrho_n)$,
$\vec y := (z_1, \dots, z_n)$. Then, the correlation coefficient (2) reads

$$ r = \sum_{i=1}^{n} g(\varrho_i) \, f(z_i) . \tag{7} $$



If we increase the number $n$ of sub-rings, while the total number $N := z_1 + \dots + z_n$ of galaxies
within $[\phi_{\rm in}, \phi_{\rm out}]$ is constant, we will eventually reach the situation where each of the sub-rings
contains one galaxy at most, i.e. $z_i = 0$ or $1$ for $i = 1, \dots, n$. At that point $f(z_i)$ in Eq. (7) will
only take the two values $f(0)$ and $f(1)$, so obviously $f(0) \neq f(1)$ is required, because otherwise
$r$ would become independent of the galaxy distribution and useless for the correlation analysis.
From these statements it follows that we can find two constants $a$ and $b$ such that, for $z_i = 0$ or $1$,

$$ \tilde f(z_i) := a \cdot f(z_i) + b = \frac{z_i}{N} , \quad i = 1, \dots, n , $$

which in turn results in a linear transformation of the correlation coefficient

$$ \tilde r := \sum_{i=1}^{n} g(\varrho_i) \, \tilde f(z_i) = a \cdot r + b' , $$

where

$$ b' := b \cdot \sum_{i=1}^{n} g(\varrho_i) . $$

A substitution of $r$ with $\tilde r$ in Eqs. (4) and (5) yields $\tilde Q_{1,2} = {\rm sign}(a) \cdot Q_{1,2}$ and
$|\tilde Q_{1,2}| = |Q_{1,2}|$. This shows that the linear transformation $f \to \tilde f$ does not influence the
maximisation of $|Q_1|$ or $|Q_2|$, our means for optimizing the correlation test. If $n$ is large enough
(i.e. $z_i = 0, 1$) we can therefore choose $f(z_i) = z_i / N$ without loss of generality, so we get

$$ r = \frac{1}{N} \sum_{i=1}^{n} z_i \, g(\varrho_i) . $$

Furthermore, if the radial positions of the $N$ galaxies within $[\phi_{\rm in}, \phi_{\rm out}]$ are denoted with $\phi_j$,
$j = 1, \dots, N$, it is clear that with increasing $n$ the radii of sub-rings $[\varrho_i, \varrho_{i+1}]$ containing one of
the galaxies approach the values $\phi_j$. As $z_i = 0$ for all the other sub-rings, we can define

$$ r_g(\phi_1, \dots, \phi_N) := \lim_{n \to \infty} \frac{1}{N} \sum_{i=1}^{n} z_i \, g(\varrho_i) = \frac{1}{N} \sum_{j=1}^{N} g(\phi_j) . \tag{8} $$

Note that, by taking the limit $n \to \infty$, we finally got rid of the arbitrary number $n$ of sub-rings
within the galaxy field under consideration. This is a first advantage of the 'weighted average'
correlation test compared to Spearman's rank test, which was applied in earlier investigations (e.g.
BS) and required a binning of the data.
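The limit taken above can be illustrated numerically: the binned coefficient with $f(z) = z/N$ approaches the unbinned weighted average as the number of annuli grows. The following sketch assumes a hypothetical linear weight function and made-up galaxy positions; the function names are illustrative.

```python
def weighted_average(phis, g):
    """Unbinned statistic: r_g = (1/N) * sum_j g(phi_j)."""
    return sum(g(phi) for phi in phis) / len(phis)


def binned_coefficient(phis, g, phi_in, phi_out, n):
    """Binned statistic with f(z) = z/N; the weight is taken at the inner radius."""
    width = (phi_out - phi_in) / n
    counts = [0] * n
    for phi in phis:
        k = min(int((phi - phi_in) / width), n - 1)  # index of the annulus
        counts[k] += 1
    return sum(z * g(phi_in + k * width) for k, z in enumerate(counts)) / len(phis)


def g(phi):
    """Hypothetical weight function, chosen only for this illustration."""
    return 1.0 - 2.0 * phi


phis = [0.05, 0.12, 0.31, 0.44]  # made-up radial positions in degrees
r_exact = weighted_average(phis, g)
r_binned = binned_coefficient(phis, g, 0.0, 0.5, 100000)
```

With 100000 annuli on a field of radius 0.5 the two values differ by at most the weight variation across a single annulus, which is of order 1e-5 for this weight function.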

In the following, we will assume the galaxies to be distributed independently of one another.
That means the galaxy distribution within $[\phi_{\rm in}, \phi_{\rm out}]$ can be described by a one-dimensional radial
probability density $p(\phi)$, which defines the probability $p(\phi)\,d\phi$ for a single galaxy to be found at
some position between two given angular radii $\phi$ and $\phi + d\phi$. Because of galaxy-galaxy correlations
our assumption is a simplification and does not hold in general. In Sect. 4.1 we argue that it should
nevertheless be applicable in this analysis.

One approach to statistically prove an association between quasars and galaxies is to start with
the 'null-hypothesis' of no association and then try to reject it with the help of the correlation
coefficient (8). Normally the null-hypothesis will be characterised by a radial probability density
$p_0(\phi) \propto \phi$, corresponding to a Poissonian galaxy distribution. However, if the galaxy fields around
the individual quasars that we merge to form the total galaxy field do not all have the same angular
radius or are irregularly shaped, $p_0(\phi)$ may look quite different. Therefore, we will write

$$ p_0(\phi) =: C_0 \cdot G(\phi) , $$

using a geometrical factor $G(\phi) \ge 0$ and a normalisation constant $C_0 > 0$ to meet the condition

$$ \int_{\phi_{\rm in}}^{\phi_{\rm out}} p_0(\phi) \, d\phi = C_0 \cdot \int_{\phi_{\rm in}}^{\phi_{\rm out}} G(\phi) \, d\phi = 1 . $$
The theory of gravitational lenses, together with some cosmological model, can give quantitative
predictions about the expected angular quasar-galaxy two-point correlation function $\xi_{qg}(\phi)$
(cf. Bartelmann 1995). For galaxy positions which are independent of one another, this implies a
one-dimensional probability density $p_a(\phi)$, which we will write in the form

$$ p_a(\phi) =: C_a(\phi) \cdot G(\phi) , $$

where $C_a(\phi)$ is related to $\xi_{qg}$ via the expression

$$ C_a(\phi) = C \cdot [1 + \xi_{qg}(\phi)] . $$

As before, normalisation is required, i.e.

$$ \int_{\phi_{\rm in}}^{\phi_{\rm out}} p_a(\phi) \, d\phi = C \cdot \int_{\phi_{\rm in}}^{\phi_{\rm out}} [1 + \xi_{qg}(\phi)] \, G(\phi) \, d\phi = 1 . $$

Given an estimate of $\xi_{qg}(\phi)$ from theory we suspect that the true distribution of galaxies might
be described by $p_a(\phi)$, so we want the correlation coefficient $r_g$ to be optimised for a distinction
between $p_0(\phi)$, representing the null-hypothesis, and $p_a(\phi)$. Applying definition (4), this can be
achieved by choosing the weight function $g(\phi)$ such that the global maximum of $|Q_1|$ is reached.
A variational calculation, shown in Appendix A.1, yields Eq. (1),

$$ g(\phi) = a\,\xi_{qg}(\phi) + b , \tag{9} $$

with arbitrary constants $a \neq 0$ and $b$. Similarly, one finds

$$ g(\phi) = \frac{a}{1 + \xi_{qg}(\phi)} + b \tag{10} $$

when maximising $|Q_2|$. Such an optimisation of the correlation test is most important if
$p_0(\phi) \approx p_a(\phi)$, i.e. $\xi_{qg}(\phi) \ll 1$, because in this case it will be most difficult to rule out one of the
possibilities. In the interesting regime of $\xi_{qg} \ll 1$, however, the expansion $(1 + \xi)^{-1} \approx 1 - \xi$
implies Eqs. (9) and (10) to be nearly equivalent. Moreover, Appendix A.2 gives the proof that
the weight function (9) is also a stationary point of the mean error level $\langle P_0(r_g \ge R) \rangle$, as long as
$\xi_{qg}(\phi) \ll 1$.

2.3 Numerically derived quasar-galaxy two-point correlation function 

The theoretically expected angular two-point correlation function between background quasars
and foreground galaxies has been derived by Bartelmann (1995) for cold dark matter (CDM)
and hot dark matter (HDM) Einstein-de Sitter cosmological models with linearly evolved
perturbation spectra. For the following analyses we will pick out the quasar-galaxy correlation
function $\xi_{qg}(\phi)$ which Bartelmann extracted from a numerical simulation^1 of a CDM universe with
$H_0 = 100$ km/(s Mpc), taking into account galaxies up to 21st magnitude. Figure 2 shows plots of
$\xi_{qg}(\phi)$ from the numerical simulation (solid line) and of the approximation

$$ \xi_{qg}(\phi) \approx \xi'(\phi) := \alpha \, (\phi_0 + \phi/{\rm deg})^{-[\ldots]} , \tag{11} $$

with $\alpha = 0.0036$ and $\phi_0 = 0.24$ (dashed line). Whereas Bartelmann's correlation functions are all
normalised such that

$$ \int_0^\infty \xi_{qg}(\phi) \, \phi \, d\phi = 0 , $$

this is obviously not true for $\xi'(\phi)$, because $\xi'(\phi) > 0$ for $\phi \in [0, \infty)$. In this sense $\xi'(\phi)$ is not
a valid fit to $\xi_{qg}(\phi)$. Nonetheless the approximation does reproduce three important features of
the correlation function: $\xi'(\phi)$ is steep for small $\phi$, flattens at $\phi \approx \phi_0$ and is very shallow for
large values of $\phi$. We will therefore use $\xi'(\phi)$ to derive an approximation to the optimum weight
function.

^1 For details, see the original paper.

Figure 2: Solid line: Quasar-galaxy two-point correlation function from a CDM Einstein-de Sitter
cosmological model with Hubble constant $H_0 = 100$ km/(s Mpc). Dashed line: Approximation
according to Eq. (11). The horizontal axis gives the angular separation in arcmin.

To allow for different values $H_0 = h \cdot 100$ km/(s Mpc) of the Hubble constant we have
to multiply^2 $\phi$ by the dimensionless parameter $h$. Setting $a = 1/\alpha$ and $b = 0$ we thus obtain from
Eq. (9)

$$ g(\phi) = (0.24 + h\,\phi/{\rm deg})^{-[\ldots]} . \tag{12} $$
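The role of the parameter $h$ in the weight function can be made concrete with a small sketch. Since the numerical exponent is not available here, it is passed in as a required argument named `exponent`; this argument, the function name `weight` and the test values are assumptions made for illustration.

```python
def weight(phi_deg, h, exponent):
    """Power-law weight of the form (0.24 + h * phi/deg) ** (-exponent).

    `exponent` deliberately has no default: its value must be supplied by
    the user and is not a number taken from the paper.
    """
    return (0.24 + h * phi_deg) ** (-exponent)


# Rescaling the angle by h is equivalent to changing the Hubble parameter:
# a field evaluated at phi with h = 0.5 equals one evaluated at 0.5 * phi with h = 1.
w_half = weight(0.2, 0.5, 2.0)   # exponent 2.0 is an arbitrary illustrative value
w_unit = weight(0.1, 1.0, 2.0)
```

Because the weight function is only defined up to a linear transformation with a nonzero slope, any overall prefactor or additive constant can be dropped without changing the sensitivity of the test.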




3 Numerical simulations 

Before applying the newly introduced weighted-average correlation test to observational data we
want to show the results of some numerical simulations. They have been performed in order to
check whether our method is more sensitive than Spearman's rank-order test, and to investigate
the influence of variations of the weight function.

3.1 Overview 

In principle, we proceed as follows: First, a large number (we use 10^) of circular galaxy fields
is generated, each of them consisting of $N$ independently and randomly chosen galaxy positions.
We interpret them as a statistical sample of total galaxy fields in the case of the null-hypothesis
of no quasar-galaxy correlation and use them to derive the distribution $P_0(r_g \ge R)$. A second
set (size: 10^) of random fields is synthesized to model observed galaxy fields with an assumed
association of quasars and galaxies. Again, the coordinates are drawn independently, but this
time according to some radial probability density profile $p_a(\phi)$, which is meant to describe the
correlation. For a given weight function $g(\phi)$ we calculate $\langle r \rangle_a$ as well as $\langle P_0(r_g \ge r) \rangle$ from
the individual values $r$ of the correlation coefficient $r_g$. We use these quantities to analyse the
sensitivity of our weighted-average correlation test as described below.
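The procedure just described can be sketched in a few lines. The version below assumes circular fields with a uniform surface density under the null hypothesis (so that the radius is drawn as the field radius times the square root of a uniform deviate) and uses rejection sampling for the correlated fields. The density profile, the weight function, the field counts and all names are hypothetical choices for illustration and much smaller than what a real analysis would use.

```python
import random
from bisect import bisect_left


def draw_null(n_gal, radius, rng):
    """Radii for a uniform surface density on a disc: p0(phi) proportional to phi."""
    return [radius * rng.random() ** 0.5 for _ in range(n_gal)]


def draw_alternative(n_gal, radius, profile, profile_max, rng):
    """Rejection sampling of p_a(phi) proportional to profile(phi) * phi."""
    out = []
    while len(out) < n_gal:
        phi = radius * rng.random() ** 0.5
        if rng.random() * profile_max <= profile(phi):
            out.append(phi)
    return out


def mean_error_level(g, profile, profile_max, n_gal, radius, n_fields, seed=1):
    """Average of P0(r_g >= r) over simulated correlated fields."""
    rng = random.Random(seed)

    def r_g(phis):
        return sum(g(p) for p in phis) / len(phis)

    null = sorted(r_g(draw_null(n_gal, radius, rng)) for _ in range(n_fields))
    total = 0.0
    for _ in range(n_fields):
        r = r_g(draw_alternative(n_gal, radius, profile, profile_max, rng))
        total += (n_fields - bisect_left(null, r)) / n_fields
    return total / n_fields


# Hypothetical example: excess density falling linearly with radius, weight g = -phi
level = mean_error_level(lambda p: -p, lambda p: 2.0 - 2.0 * p, 2.0,
                         n_gal=200, radius=0.5, n_fields=200)
```

A value of `level` well below 0.5 indicates that the chosen weight function separates the two samples; the same loop can be repeated for a family of weight functions to study how the sensitivity depends on the weight.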

Both the error level $P_0(r_g \ge \langle r \rangle_a)$ of the mean correlation coefficient and the mean error level
$\langle P_0(r_g \ge r) \rangle$ quantify the distinction between our two simulated samples: very low values (or
values near unity) signify a good distinction, values around 0.5 a bad distinction. The effect of
deviations of the weight function from its optimum $g(\phi)$ can be examined by defining a
parameterized weight function $g(\phi, \epsilon)$ such that $g(\phi, \epsilon = \epsilon_0) = g(\phi)$ for some $\epsilon = \epsilon_0$ and repeating the
simulation for a whole set of values for $\epsilon$.

^2 This can be concluded from Eqs. (2.12), (2.32) and (2.33) of Bartelmann (1995): Observing that
$k \propto h$ and $k_g \propto h$ (see e.g. Padmanabhan 1993) one finds
$\xi_{qg}(\phi; h = h') = C(h') \, \xi_{qg}(h' \phi; h = 1)$.

Additionally we subject each of the galaxy fields of the second sample to Spearman's rank-order 
test to obtain the corresponding error level of the mean correlation coefficient and the mean error 
level. That allows for a direct comparison between the weighted-average method and Spearman's 
rank-order test, a short description of which is given below: 

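• The circular galaxy field is divided into $n = 25$ rings of equal area. The number 25 has been
adopted from the Bartelmann & Schneider (1994) analysis of quasar-galaxy associations.

• The number of galaxies within each ring is determined and ranked, yielding a scheme of
ranks $(R^z_1, \dots, R^z_{25})$ with $R^z_i \in \{1, \dots, 25\}$.

• The distance of the rings from the center is ranked in descending order, i.e. rings closer to the
center are ranked higher. The result is a second rank scheme $(R^\varrho_1, \dots, R^\varrho_{25}) = (25, 24, \dots, 1)$.

• Spearman's rank-order correlation coefficient $r_s$ is calculated from its definition as the linear
correlation coefficient of the rank schemes,

$$ r_s := \frac{\sum_{i=1}^{n} (R^z_i - \bar R^z)(R^\varrho_i - \bar R^\varrho)}{\sqrt{\sum_{i=1}^{n} (R^z_i - \bar R^z)^2} \, \sqrt{\sum_{i=1}^{n} (R^\varrho_i - \bar R^\varrho)^2}} , $$

where

$$ \bar R^{z,\varrho} := (1/n) \sum_{i=1}^{n} R^{z,\varrho}_i . $$

Taking into account that the rank schemes are always permutations of $\{1, \dots, n\}$ this
expression can be simplified to

$$ r_s = 1 - \frac{6 \sum_{i=1}^{n} (R^z_i - R^\varrho_i)^2}{n \, (n^2 - 1)} . $$

• If the number of galaxies within each ring is independent of the distance of the ring from
the center and independent of the numbers of galaxies within the other rings, then the rank
schemes $\{R^z_i\}$ and $\{R^\varrho_i\}$ are random permutations of each other. In this case the distribution
of $r_s \sqrt{(n - 2)/(1 - r_s^2)}$ for $n > 10$ is excellently approximated by a Student t-distribution with
$n - 2$ degrees of freedom. Therefore, the error level of Spearman's rank-order test is given by

$$ P(r_s \ge R) = \begin{cases} \frac{1}{2} \, I_{1 - R^2}\!\left(\frac{n - 2}{2}, \frac{1}{2}\right) & \text{for } R > 0 , \\ 1 - \frac{1}{2} \, I_{1 - R^2}\!\left(\frac{n - 2}{2}, \frac{1}{2}\right) & \text{for } R < 0 , \end{cases} $$

with $I_z(a, b)$ denoting the incomplete beta function

$$ I_z(a, b) := \frac{\int_0^z t^{a-1} (1 - t)^{b-1} \, dt}{\int_0^1 t^{a-1} (1 - t)^{b-1} \, dt} . $$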
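The steps of the list above, up to the simplified expression for the rank-order coefficient, can be sketched as follows. The code assumes that rings with more galaxies receive higher ranks and that ties in the counts are broken by ring index, so that the ranks always form a permutation; both conventions are assumptions of this sketch, as is the function name.

```python
def spearman_ring_test(phis, radius, n_rings=25):
    """Rank-order coefficient between ring counts and ring proximity to the center."""
    # Equal-area rings: ring k spans radius*sqrt(k/n) .. radius*sqrt((k+1)/n)
    counts = [0] * n_rings
    for phi in phis:
        k = min(int(n_rings * (phi / radius) ** 2), n_rings - 1)
        counts[k] += 1
    # Rank the counts 1..n (1 = fewest galaxies); ties are broken by ring index
    order = sorted(range(n_rings), key=lambda k: (counts[k], k))
    rank_z = [0] * n_rings
    for rank, k in enumerate(order, start=1):
        rank_z[k] = rank
    # The innermost ring receives the highest rank: (n, n-1, ..., 1)
    rank_rho = [n_rings - k for k in range(n_rings)]
    d2 = sum((a - b) ** 2 for a, b in zip(rank_z, rank_rho))
    return 1.0 - 6.0 * d2 / (n_rings * (n_rings ** 2 - 1))


# Made-up field in which ring k holds 25 - k galaxies, i.e. counts fall outwards
R, n = 0.5, 25
falling = [R * ((k + 0.5) / n) ** 0.5 for k in range(n) for _ in range(n - k)]
r_s = spearman_ring_test(falling, R, n)
```

The error level then follows from the incomplete beta function given above, which is not part of the Python standard library and is therefore left out of this sketch.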







Figure 3: Left panel: "Correlation profile" $C_a(\phi)$ according to Eq. (13). Right panel:
Parameterized weight function $g(\phi, \epsilon)$ as defined in Eq. (14) for $\epsilon = 0.75$ (solid line), $\epsilon = 0$ (dotted
line) and $\epsilon = 1$ (dashed line).



3.2 Results 



In the following, all angles will be measured in degrees and treated as dimensionless quantities. On
circular galaxy fields the null-hypothesis corresponds to a probability density of galaxy positions

$$ p_0(\phi) = C_0 \, G(\phi) . $$

Let us fix the radius of the simulated galaxy fields to 0.5 (degrees) and set

$$ G(\phi) := \phi , $$

which requires $C_0 = 8$ in order to assure the normalisation of $p_0(\phi)$. For our first series of
simulations we arbitrarily choose

$$ p_a(\phi) = C_a(\phi) \, G(\phi) := \left( \frac{64}{3} \phi^2 - 32 \phi + 16 \right) \cdot G(\phi) \tag{13} $$

to describe a hypothetical quasar-galaxy correlation on the galaxy fields (Fig. 3, left panel).
Moreover we define a parameterized weight function (Fig. 3, right panel)

$$ g(\phi, \epsilon) := (16 \epsilon - 8) \, \phi^2 - 8 \epsilon \, \phi + 1 \tag{14} $$



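that satisfies the condition (1) for an optimum weight function if $\epsilon = 0.75$. From these expressions
we can derive the mean correlation coefficient

$$ \langle r_g \rangle_a = \left\langle \frac{1}{N} \sum_{j=1}^{N} g(\phi_j, \epsilon) \right\rangle_a = \int_0^{0.5} g(\phi, \epsilon) \, p_a(\phi) \, d\phi = \frac{7 - 32 \epsilon}{45} . \tag{15} $$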
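The normalisation of the toy density and the closed form of the mean correlation coefficient can be checked with exact rational arithmetic. The sketch below assumes the polynomial forms $p_a(\phi) = (\frac{64}{3}\phi^2 - 32\phi + 16)\,\phi$ and $g(\phi, \epsilon) = (16\epsilon - 8)\phi^2 - 8\epsilon\phi + 1$ on the interval $[0, 0.5]$; the helper names are illustrative.

```python
from fractions import Fraction as F


def integrate_poly(coeffs, upper):
    """Integral from 0 to `upper` of sum_k coeffs[k] * phi**k."""
    return sum(c * upper ** (k + 1) / (k + 1) for k, c in enumerate(coeffs))


def poly_mul(p, q):
    """Coefficient list of the product of two polynomials."""
    out = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


HALF = F(1, 2)
P_A = [F(0), F(16), F(-32), F(64, 3)]   # coefficients of p_a(phi) in powers of phi


def g_poly(eps):
    """Coefficients of g(phi, eps) in powers of phi."""
    return [F(1), -8 * eps, 16 * eps - 8]


def mean_r(eps):
    """Expectation value of g under p_a on [0, 1/2]."""
    return integrate_poly(poly_mul(g_poly(eps), P_A), HALF)


norm = integrate_poly(P_A, HALF)         # should equal 1 for a normalised density
mean_at_optimum = mean_r(F(3, 4))        # compare with (7 - 32 * 3/4) / 45
```

Using `Fraction` avoids rounding errors, so the comparison with the closed form is exact rather than approximate.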



The outcome of simulations for different values of $N$ carried out as explained in the preceding
section is summarized in Fig. 4. The individual plots visualize the error level of the mean
correlation coefficient or the mean error level, respectively, as a function of the weight-function parameter
$\epsilon$. In the heading of each panel one can find, apart from the number of galaxies $N$, the results
obtained from Spearman's rank-order test (which are, of course, independent of $\epsilon$), denoted by
$\varepsilon_{\rm S}$. Two important facts can be seen in Fig. 4:
• Both the error level of the mean correlation coefficient and the mean error level do indeed
attain a minimum for a weight function near the expected optimum $g(\phi, \epsilon = 0.75)$. However,
the exact position of the minimum is slightly displaced with respect to the predicted one.

• Regarding the error level of the mean correlation coefficient as well as the mean
error level, the weighted-average correlation test yields a more significant
distinction between the two samples of galaxy fields than Spearman's rank-order test. This remains
true even if the weight function deviates significantly from its optimum.
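For the toy model of this section the separation measure $Q_1$ can be evaluated in closed form, which gives an independent check of where the optimum lies. The sketch below assumes $p_0(\phi) = 8\phi$, $p_a(\phi) = (\frac{64}{3}\phi^2 - 32\phi + 16)\,\phi$ and $g(\phi, \epsilon) = (16\epsilon - 8)\phi^2 - 8\epsilon\phi + 1$ on $[0, 0.5]$. For a field of $N$ independent galaxies $Q_1$ scales as $\sqrt{N}$, so only the $N$-independent factor is computed; all names are illustrative.

```python
def moment(coeffs, density, upper=0.5):
    """Integral over [0, upper] of poly(coeffs) * poly(density), term by term."""
    total = 0.0
    for i, a in enumerate(coeffs):
        for j, b in enumerate(density):
            total += a * b * upper ** (i + j + 1) / (i + j + 1)
    return total


P0 = [0.0, 8.0]                        # p0(phi) = 8 * phi
PA = [0.0, 16.0, -32.0, 64.0 / 3.0]    # p_a(phi) as polynomial coefficients


def q1_per_sqrt_n(eps):
    """(<g>_a - <g>_0) / sqrt(var_0(g)) for the parameterized weight function."""
    g = [1.0, -8.0 * eps, 16.0 * eps - 8.0]
    # Coefficients of g squared (degree 4)
    g2 = [sum(g[i] * g[k - i] for i in range(3) if 0 <= k - i < 3) for k in range(5)]
    mean0, mean_a = moment(g, P0), moment(g, PA)
    var0 = moment(g2, P0) - mean0 ** 2
    return (mean_a - mean0) / var0 ** 0.5


grid = [i / 100 for i in range(0, 151)]
best = max(grid, key=lambda e: abs(q1_per_sqrt_n(e)))
```

On this grid the maximum of $|Q_1|$ falls at $\epsilon = 0.75$, but the curve is very flat around it, which is consistent with the test remaining sensitive for weight functions that deviate from the optimum.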

As to the stated displacement of the minimum of the mean error level we refer to Appendix A.2,
where the optimum weight function (1) is shown to approach a stationary point of the mean error
level if $p_a(\phi)$ comes close enough to $p_0(\phi)$. This suggests that the real position of the minimum
should be closer to the predicted one if the difference between $p_a(\phi)$ and $p_0(\phi)$ is smaller. In
order to verify this, we performed a second series of simulations (in this case the null-hypothesis
was modelled using only 5 · 10^ random fields), now setting (cf. Fig. 5, left panel)

$$ p_a(\phi) = C_a(\phi) \, G(\phi) := \frac{32}{13} \left( 2 \phi^2 - 3 \phi + 4 \right) \cdot G(\phi) . \tag{16} $$

The right panel of Fig. 5 displays the mean error level for $N = 200$ galaxies in the total galaxy
field. Clearly, the minimum is located closer to the expected position than before. We can also
see from Fig. 5 that Spearman's rank-order test is actually not too bad compared to our new test.
It should also be noted that the gain in sensitivity of our new test is obtained by providing more
a priori information to the test, namely the shape of the expected two-point correlation function.
On the other hand, the correlation function chosen here is particularly favourable for Spearman's
rank-order test, since it can be approximated nearly as a linear function of the angular radius on
the scales considered.



4 Data 

Now that the weighted average as a method to detect possible quasar-galaxy associations has
been tested numerically, we are going to reanalyze the correlations between 1-Jy quasars and IRAS
galaxies reported in BS. Because of the high significance of their results it seems promising to try
to find additional information such as a correlation scale or amplitude.

4.1 Sample selection 

Our investigation is based on exactly the same data set as was used in BS; the reader is referred 
to this paper for details. 

The positions and photometric data of the galaxies are taken from the Infrared Astronomical
Satellite (IRAS) Faint Source Catalog, applying the criterion

$$ S_{60}^2 > S_{12} \, S_{25} $$

to identify a source as a likely galaxy, where $S_n$ denotes the flux at $n$ micron. To exclude very
faint as well as very strong nearby sources the sample is then restricted to objects within the range

$$ 0.3\,{\rm Jy} < S_{60} < 1\,{\rm Jy} . $$

A sample of quasars is provided by the optically identified fraction of the 1-Jy catalog. This
catalog contains bright extragalactic radio sources with a 5-GHz flux above 1 Jy (Kühr et al.
1981, Stickel et al. 1993, Stickel & Kühr 1993a, 1993b). From the 426 radio sources with known
redshift we select different subsamples, each of them characterized by a lower and an upper limit
$z_{\min}$ and $z_{\max}$ of the redshift as well as an upper limit $m_{\max}$ of the apparent magnitude.

To perform the statistical analysis we extract from the IRAS catalog a ring-like galaxy field of
radii $\phi_{\rm in}$ and $\phi_{\rm out}$ around the position of each of the selected quasars. Merging them as explained
in Sect. 2.2 we end up with a total ring-like galaxy field of $N$ galaxies, which can be subjected to the
weighted-average correlation test. It has already been pointed out that we assume the galaxies to
be distributed independently of one another. In particular this assumption enters the derivation of
the optimum weight function and is fundamental for the calculation of any error level given in this
paper, because our null-hypothesis is a Poissonian distribution of galaxies. However, the number
density of IRAS galaxies is of the order of 1 per square-degree, and the individual galaxy fields
are taken from very different positions on the sky. As a consequence, galaxy-galaxy correlations
on the merged total field should be negligible.

Figure 4: Error level of the mean correlation coefficient (left column) and mean error level (right
column) for the weighted-average correlation test as a function of the parameter $\epsilon$ of the weight
function (14). The heading of each panel displays the corresponding results from Spearman's
rank-order test ($\varepsilon_{\rm S}$) and the number of galaxies $N$.

Figure 5: Left panel: "Correlation profile" $C_a(\phi)$ according to Eq. (16) (solid line) and constant
value $C_0 = 8$ (dashed line). Right panel: Mean error level for the weighted-average correlation test
as a function of the parameter $\epsilon$ of the weight function (14). The number of galaxies is $N = 200$.
As in Fig. 4, $\varepsilon_{\rm S}$ denotes the corresponding result obtained from Spearman's rank-order test.

4.2 Correlation analysis 

In order to allow for a direct comparison of our results with those obtained in BS using Spearman's
rank-order test, we first adopt the value of $\phi_{\rm out} = 28.21$ arcmin. This radius had been chosen such
that the area $\pi \phi_{\rm out}^2$ around each of the quasars was 2500 arcmin$^2$, in agreement with an earlier
investigation of 1-Jy sources and Lick galaxies by Fugmann (1990).

As the IRAS catalog contains a few sources which are located within a few arcseconds of
the position of one of the 1-Jy quasars (cf. BS) and can therefore possibly be identified with the
corresponding quasars, we remove a small circle from the center of our galaxy fields by setting
$\phi_{\rm in} = 10$ arcsec.

Tables 1 and 2 summarize the results of the correlation tests. The different columns present the
following information: $z_{\min}$ denotes the minimum redshift, $z_{\max}$ the maximum redshift, $m_{\max}$ the
maximum apparent magnitude of the selected quasar subsample and $N_Q$ the number of quasars
within this subsample. The total number of galaxies is $N$ and the two remaining columns display
the error levels $\varepsilon_{\rm wa}$ and $\varepsilon_{\rm sp}$ in percent as obtained from the weighted-average and Spearman's
rank-order tests, respectively. The former is derived from a numerical simulation of 10^ random
(Poissonian) galaxy fields, the latter is taken from BS. In Table 1, where no value of $z_{\max}$ is given,
the subsamples do not have an upper limit of the redshift, i.e. formally $z_{\max} = \infty$.

A comparison of $\varepsilon_{\rm wa}$ and $\varepsilon_{\rm sp}$ reveals similar properties of the error levels of the weighted average
and Spearman's rank test concerning their dependence on $z_{\min}$ and $m_{\max}$. However, for subsamples
with $z_{\min} < 1.5$ the weighted average very often yields strikingly lower values of the error level
Table 1: Results of the correlation tests between 1-Jy quasars and IRAS galaxies. The error level
(in %) as obtained from the weighted average is denoted by $\varepsilon_{\rm wa}$, whereas $\varepsilon_{\rm sp}$ is the error level
from Spearman's rank test as reported in BS. The remaining symbols are $z_{\min}$ and $m_{\max}$ for
the minimum redshift and maximum apparent magnitude of the quasar subsample, $N_Q$ for the
number of quasars, $N$ for the number of galaxies and $\phi_{\rm out}$ denoting the outer radius of the galaxy
fields. Entries marked … are not legible in the source.

              phi_out = 28.21'

  z_min   m_max    N_Q     N    eps_wa   eps_sp
  0.50    21.00     …      …      …        …
  0.50    20.00     …      …      …        …
  0.50    19.00     …      …      …        …
  0.50    18.75     …      …      …        …
  0.50    18.50     …      …      …        …
  0.50    18.25     …      …      …        …
  0.50    18.00     …      …      …        …
  0.75    21.00     …      …      …        …
  0.75    20.00     …      …      …        …
  0.75    19.00     …      …      …        …
  0.75    18.75     …      …      …        …
  0.75    18.50     …      …      …        …
  0.75    18.25     …      …      …        …
  0.75    18.00     …      …      …        …
  1.00    21.00    130     90     4.8     32.4
  1.00    20.00    123     86     3.1     22.5
  1.00    19.00    107     79     8.6     29.3
  1.00    18.75     94     70     5.5     17.8
  1.00    18.50     88     61     6.0     24.4
  1.00    18.25     60     47    52.5     53.8
  1.00    18.00     54     43    74.0     73.2
  1.25    21.00     97     65     1.0      2.8
  1.25    20.00     93     63     0.9      4.2
  1.25    19.00     80     56     3.1      4.4
  1.25    18.75     68     47     1.4      4.7
  1.25    18.50     64     46     1.1      6.2
  1.25    18.25     42     33    26.6     14.2
  1.25    18.00     37     29    48.9     38.2
  1.50    21.00     59     33     9.5      0.9
  1.50    20.00     56     33     9.5      0.2
  1.50    19.00     46     30     5.6      0.7
  1.50    18.75     36     23     3.8      0.4
  1.50    18.50     34     22     2.9      0.3
  1.50    18.25     20     14     7.9      0.2
  1.50    18.00     18     14     7.9      1.2


13 



Table 2: As Table 1, but for non-overlapping redshift intervals of the quasar subsamples.

Figure 6: Distribution of IRAS galaxies around 1Jy quasars of redshift $z > 0.5$ and apparent
visual magnitude $m < 21$.

(and thus statistically more significant results) than Spearman's rank test. This effect becomes
even more important when one takes into account that the $\epsilon_{\rm sp}$ have been derived without removing the
mentioned IRAS counterparts of the 1Jy quasars from the galaxy fields. Such counterparts can
be found in the subsamples with $z_{\rm min} < 1$, so the reported values of $\epsilon_{\rm sp}$ for them are probably too
low.

As shown in BS, the IRAS catalog probably extends to galaxy redshifts up to 1. As a consequence
there might be a spatial association between IRAS galaxies and 1Jy quasars with a redshift
below $z = 1$ which, of course, could show up in our correlation tests. Therefore, the quasar
subsamples with $z_{\rm min} > 1$ are the most interesting ones in the context of gravitational lensing. Table 2
gives a better insight into the dependence of the correlations upon the quasar redshift, because the
listed results are based on non-overlapping redshift bins. Surprisingly, we find an anticorrelation
for $1.0 < z < 1.25$, but at a significance below 10%.

Now we want to investigate how the results are influenced by the choice of the
outer radius $\phi_{\rm out}$ of the galaxy fields. As Spearman's rank test is sensitive to the overall gradient
of the galaxy number density over the field, $\phi_{\rm out}$ must be adapted to the angular scale of the
expected correlations. But then a low error level can equally be induced by a galaxy overdensity
near the center or an underdensity in the outer regions of the field, relative to the mean number
density on large scales. Figure 6 shows the distribution of galaxies around 1Jy quasars on fields
of radius 1 deg. Each dot corresponds to one galaxy: the position along the horizontal axis
indicates its distance to the quasar, whereas the quasar redshift can be read off the vertical
axis. The quadratic scaling of the horizontal axis ensures that a constant galaxy number density
on the galaxy fields transforms to a constant number density of dots in the plot. Considering the
galaxy distribution around quasars of redshift $z > 1.5$, i.e. the dots above the dashed line, one
finds that there seems to be a "hole" in the range $0.15 < \phi^2 / (1\,{\rm deg})^2 < 0.25$, which is the outer
region of galaxy fields with $\phi_{\rm out} \approx 30$ arcmin. This deficit of galaxies could possibly yield a major
contribution to the low error levels recorded in Tables 1 and 2 for $z_{\rm min} = 1.5$. To remove that
contribution one would like to increase $\phi_{\rm out}$, thereby gaining information about the large-scale
mean galaxy number density. On the other hand, this will decrease the overall density gradient
once $\phi_{\rm out}$ becomes larger than the correlation length scale, and therefore the significance of the
result from Spearman's rank test will decrease.
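The remark about the quadratic scaling of the horizontal axis can be checked with a short simulation. The following is only a minimal sketch: the function name, the number of galaxies and the number of bins are illustrative assumptions and are not taken from the paper. Galaxies are placed with constant surface density on a circular field, and the histogram of $(\phi/\phi_{\rm out})^2$ should then come out approximately flat.

```python
import random


def squared_radius_counts(n_gal, phi_out, n_bins, seed=0):
    """Histogram of (phi / phi_out)^2 for galaxies of constant surface density."""
    rng = random.Random(seed)
    counts = [0] * n_bins
    placed = 0
    while placed < n_gal:
        # Draw a point uniformly in the enclosing square and keep it only
        # if it falls inside the circular field (rejection sampling).
        x = rng.uniform(-phi_out, phi_out)
        y = rng.uniform(-phi_out, phi_out)
        u = (x * x + y * y) / (phi_out * phi_out)
        if u >= 1.0:
            continue
        counts[min(int(u * n_bins), n_bins - 1)] += 1
        placed += 1
    return counts


# Equal-width bins in phi^2 should hold roughly equal numbers of galaxies.
counts = squared_radius_counts(n_gal=20000, phi_out=1.0, n_bins=5)
print(counts)
```

A deficit such as the "hole" discussed above would show up as one bin lying clearly below the others.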

Figure 7: Error level for rejecting the null-hypothesis of a Poissonian galaxy distribution as a
function of the outer field radius $\phi_{\rm out}$ for four different quasar subsamples. The redshift range of
the quasars is denoted by $z$, their maximum visual magnitude is $m_{\rm max} = 21$ for the solid lines and
$m_{\rm max} = 19$ for the dotted lines.

This problem can be overcome with the weighted-average correlation test, if a weight function
$g(\phi)$ like (13) is applied. On the angular scale $\phi < \phi_0 = 0.24$ deg the strong dependence of the
expected correlations on $\phi$ enables the weighted average to detect a gradient of the galaxy number
density. For angular distances $\phi > \phi_0$ the weight function $g(\phi)$ is flat. Therefore, in the outer
regions not the exact distribution of galaxies but only their mean number density is relevant. This
means the weighted average is sensitive to a quasar-galaxy correlation on a fixed angular
radius $\phi_0$ and at the same time includes the information on the mean galaxy density from the
outer parts of the field. Consequently, and in contrast to Spearman's rank test, our method can
be used to effectively analyse large galaxy fields.
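The way the weighted average combines a steep inner weight with a flat outer weight can be illustrated with a small sketch. It is not the implementation used for the results reported here: the weight function below is a hypothetical linear ramp that vanishes beyond $\phi_0$ (the optimum weight function of Eq. (13) is not reproduced), and all function names and default parameters are assumptions made for illustration. The error level is estimated as the fraction of random (Poissonian) fields whose weighted average reaches the observed value.

```python
import math
import random


def sample_uniform_field(n_gal, phi_out, rng):
    """Radial distances of n_gal galaxies spread uniformly over a disc of radius phi_out."""
    # A constant surface density corresponds to p0(phi) proportional to phi,
    # which is obtained by taking the square root of a uniform deviate.
    return [phi_out * math.sqrt(rng.random()) for _ in range(n_gal)]


def weight(phi, phi0=0.24):
    """Hypothetical weight: decreases linearly inside phi0 and is flat (zero) outside."""
    return max(0.0, 1.0 - phi / phi0)


def weighted_average(phis, g=weight):
    """r_g = (1/N) * sum_i g(phi_i)."""
    return sum(g(p) for p in phis) / len(phis)


def error_level(r_obs, n_gal, phi_out, n_sim=2000, seed=1):
    """Fraction of random (Poissonian) fields with r_g >= r_obs."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        if weighted_average(sample_uniform_field(n_gal, phi_out, rng)) >= r_obs:
            hits += 1
    return hits / n_sim
```

Because the hypothetical weight is constant for $\phi > \phi_0$, galaxies in the outer field enter only through their number, which is how the mean density information is retained on large fields.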

For four different quasar subsamples Fig. 7 displays the error level for rejecting the
null-hypothesis of a Poissonian galaxy distribution as a function of the outer field radius $\phi_{\rm out} < 2$ deg.
The curves illustrate that a relatively small variation of $\phi_{\rm out}$ can have a significant effect on the
resulting value of the error level, even if $\phi_{\rm out}$ is beyond 1 deg. Tables 3 and 4 summarize the
results obtained by reanalysing all the quasar subsamples as listed in Tables 1 and 2 with an outer
radius of 2 deg.

A more detailed view of the angular galaxy distribution at small distances to 1Jy quasars can
be obtained from the correlation functions plotted in Fig. 8. They are calculated using ring-like
total galaxy fields of radii $\phi_{\rm in} = 10$ arcsec and $\phi_{\rm out} = 0.5$ deg constructed according to Sec. 2.2
by merging the individual galaxy fields around the quasars of a given subsample. As before, the
inner part is removed to ensure that the fields are not contaminated by the quasars themselves.
A total galaxy field is then divided into $n = 10$ sub-rings $[\phi_i, \phi_{i+1}]$ given by Eq. (^). Denoting the
number of galaxies within sub-ring no. $i$ by $z_i$ and its solid angle by $A_i := \pi\,(\phi_{i+1}^2 - \phi_i^2)$ we derive
the value from

and assign it to the "mean radius" $(\phi_i + \phi_{i+1})/2$ of the ring.

Table 3: As Table 1, but with an outer galaxy field radius of $\phi_{\rm out} = 2^\circ$.

Table 4: As Table 2, but with an outer galaxy field radius of $\phi_{\rm out} = 2^\circ$.

Figure 8: Quasar-galaxy correlation function for 1Jy quasars and IRAS galaxies determined from
different quasar subsamples. The redshift range of the quasars is denoted by $z$, their maximum
visual magnitude by $m_{\rm max}$.

In the diagrams of Fig. 8 the corresponding points are connected by a dashed line which can
be interpreted as an approximation to the quasar-galaxy correlation function. To give an estimate
of the errors we do not attach error bars of length $\sqrt{z_i}$ to the graph, as is commonly done, because
their significance level depends strongly on $z_i$ if $z_i$ is small. Instead we numerically generate a
set of 1000 artificial galaxy fields with a random (Poissonian) galaxy distribution. The number of
galaxies on each of them is equal to that of the total galaxy field of the regarded quasar subsample.
They are subjected to the same procedure as the real fields, and for each sub-ring we derive the
mean value of the correlation function and a 70% error bar: 15% of the simulations yield a value below
the lower end of the error bar, and an equal fraction yields a value above the upper end.
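The construction of the simulated 70% error bars can be sketched as follows. This is an illustration under stated assumptions, not the code behind Fig. 8: the sub-ring boundaries are taken to be equally spaced in radius (the defining equation is not reproduced in this section), the plotted quantity is replaced by the plain surface density $z_i / A_i$, and all function names are hypothetical.

```python
import bisect
import math
import random


def ring_edges(phi_in, phi_out, n_rings):
    """Boundaries of n_rings sub-rings; equal widths are an assumption of this sketch."""
    step = (phi_out - phi_in) / n_rings
    return [phi_in + i * step for i in range(n_rings + 1)]


def ring_densities(phis, edges):
    """Galaxies per unit solid angle in each sub-ring, z_i / A_i."""
    n_rings = len(edges) - 1
    counts = [0] * n_rings
    for phi in phis:
        i = min(max(bisect.bisect_right(edges, phi) - 1, 0), n_rings - 1)
        counts[i] += 1
    return [counts[i] / (math.pi * (edges[i + 1] ** 2 - edges[i] ** 2))
            for i in range(n_rings)]


def random_field(n_gal, phi_in, phi_out, rng):
    """Radii of n_gal galaxies with constant surface density on the ring."""
    lo, hi = phi_in ** 2, phi_out ** 2
    return [math.sqrt(lo + rng.random() * (hi - lo)) for _ in range(n_gal)]


def simulated_band(n_gal, phi_in, phi_out, n_rings=10, n_sim=1000, seed=0):
    """Per sub-ring: (lower end, mean, upper end) of the central 70% of simulations."""
    rng = random.Random(seed)
    edges = ring_edges(phi_in, phi_out, n_rings)
    sims = [ring_densities(random_field(n_gal, phi_in, phi_out, rng), edges)
            for _ in range(n_sim)]
    band = []
    for i in range(n_rings):
        values = sorted(sim[i] for sim in sims)
        lower = values[int(0.15 * n_sim)]       # about 15% of simulations lie below
        upper = values[int(0.85 * n_sim) - 1]   # about 15% of simulations lie above
        band.append((lower, sum(values) / n_sim, upper))
    return band
```

Taking the band from the sorted simulation results, rather than from $\sqrt{z_i}$, keeps the quoted coverage meaningful even for the innermost rings, where only a handful of galaxies are expected.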

The selected quasar subsamples are characterised in the diagram headings by the redshift $z$ of
the quasars and their apparent visual magnitude $m$. The bottom right panel which is calculated
from our largest subsample shows a strong signal of a correlation. However, if compared to Fig. ^ 
the length scale of the correlation appears to be smaller and its amplitude to be much greater than 
expected from the gravitational lens effect. This is not too surprising, because as it was mentioned 
above there might be a spatial association between IRAS galaxies and low redshift quasars. The 
remaining three panels are obtained from three neighbouring high-redshift subsamples. They
visualize the results from Table 2: quasar-galaxy anticorrelation for the quasar redshift range
$1.0 < z < 1.25$, correlations for $1.25 < z < 1.5$ and $z > 1.5$. From the top right diagram it is
evident that the correlation for quasars with $1.25 < z < 1.5$ is produced by a significant galaxy
overdensity close to the quasars, again with an amplitude greater than expected from the numerical
simulations reported by Bartelmann (1995), whereas it seems to be caused by a sharp decrease of
the galaxy number density at greater distances in the highest redshift subsample. This decrease
corresponds to the "hole" in the galaxy distribution which we have already seen in Fig. 6.

4.3 Additional simulations 

Given our null-hypothesis of a purely random distribution of galaxies around the 1Jy quasars we
have so far quantified our correlation results in terms of the error level. The error level assigned to
a value $R$ of the correlation coefficient $r$ has been introduced to be the probability $\epsilon := P_0(r \ge R)$
for $r$ to take a value equal to or greater than $R$ for a Poissonian galaxy distribution.

Suppose now we reject the null hypothesis for some of the quasar subsamples, because the
associated error level is low. Assuming the quasar-galaxy correlation to be induced by gravitational
lensing we would expect it to be described approximately by a correlation function as derived by
Bartelmann (see Sec. 2.2), a fit to which we gave in Eq. (11). With this new premise we can then
again ask the question: What is the probability $P_a(r \ge R)$ for the correlation coefficient to take
a value equal to or greater than $R$ or, equivalently, what is the probability to find an error level
equal to or lower than $\epsilon$?

In order to find an answer we generated a large number of synthetic galaxy fields of radius
$\phi_{\rm out} = 30'$ following the correlation function (11) and containing 100 galaxies each. We subjected
them to both Spearman's rank-order test and the weighted-average correlation test with the
optimum weight function (13) ($h = 1$). The fraction of simulated fields resulting in an error level
equal to or lower than $\epsilon$ then directly yields the required probability, which is plotted in Fig. 9,
left panel, as a function of $\epsilon$. The solid line presents the result for the weighted-average test, the
dashed line for Spearman's rank test. The dotted line would arise for both correlation tests if
the simulated galaxies were distributed randomly; it just reflects the definition of the error level.
Repeating the whole procedure with 3000 galaxies per field produces the right panel of Fig. 9.
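The statement that the dotted line merely reflects the definition of the error level can be made concrete with a toy simulation. The sketch below is illustrative only: the weight function is a hypothetical linear ramp, the "correlated" fields use a simple mixture in which a small fraction of galaxies is concentrated within $\phi_0$ (this is not the correlation function (11)), and all names and parameter values are assumptions. For random fields the fraction with an error level at or below $\epsilon$ should be close to $\epsilon$ itself, whereas the toy excess lifts the curve above that diagonal.

```python
import bisect
import math
import random

PHI_OUT = 0.5   # field radius in degrees (30 arcmin, as in the text)
PHI_0 = 0.24    # scale of the hypothetical weight function, in degrees


def weight(phi):
    """Hypothetical weight: linear ramp inside PHI_0, zero outside."""
    return max(0.0, 1.0 - phi / PHI_0)


def draw_field(n_gal, excess, rng):
    """Radial distances; a fraction `excess` is concentrated within PHI_0 (toy model)."""
    field = []
    for _ in range(n_gal):
        radius = PHI_0 if rng.random() < excess else PHI_OUT
        field.append(radius * math.sqrt(rng.random()))
    return field


def statistic(field):
    """Weighted average r_g of one field."""
    return sum(weight(p) for p in field) / len(field)


def error_levels(n_fields, n_gal, excess, reference, rng):
    """Error level of each field: fraction of sorted null reference values >= its r_g."""
    levels = []
    for _ in range(n_fields):
        r = statistic(draw_field(n_gal, excess, rng))
        n_ge = len(reference) - bisect.bisect_left(reference, r)
        levels.append(n_ge / len(reference))
    return levels


def fraction_at_or_below(levels, eps):
    """Fraction of simulated fields with an error level equal to or lower than eps."""
    return sum(1 for e in levels if e <= eps) / len(levels)


rng = random.Random(42)
reference = sorted(statistic(draw_field(100, 0.0, rng)) for _ in range(1000))
null_levels = error_levels(1000, 100, 0.0, reference, rng)
toy_levels = error_levels(1000, 100, 0.05, reference, rng)
for eps in (0.05, 0.2, 0.5):
    print(eps, fraction_at_or_below(null_levels, eps), fraction_at_or_below(toy_levels, eps))
```

How far the curve for correlated fields rises above the diagonal depends on the strength of the assumed excess and on the number of galaxies per field, which is the comparison made in the two panels of Fig. 9.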

As to quasar subsamples with a total number of galaxies $N \approx 100$ on fields of $\phi_{\rm out} \approx 30'$, Fig. 9
clearly shows that it is hardly more probable to find low values of the error level in the case of the
expected, lensing-induced quasar-galaxy correlation than it is for randomly distributed galaxies.
When interpreting this result one should be aware of two facts: On the one hand, $N = 100$ is a
typical number for rather large quasar subsamples (cf. Tables 1 and 2) and for subsamples with
fewer galaxies the difference between the correlated and the random distributions will be even
smaller. On the other hand, the correlation function (11) is a fit to what Bartelmann derived from
numerical simulations for an artificial galaxy catalog, which includes all galaxies with a visual
magnitude $m < 21$. Therefore it is quite probable that the IRAS catalog is deeper in redshift than
this synthetic catalog and shows a stronger correlation with quasars. Nonetheless, as the curves
in Fig. 9 are so extremely close together, we conclude that either

• the low error levels are the result of a statistical fluctuation and do not correspond to a real
correlation, or

• the numerical results in Bartelmann (1995) significantly underestimate the expected
correlations between QSOs and IRAS galaxies, either because the non-linear density
inhomogeneities are not properly resolved in these simulations, or because the assumed redshift
distribution is much shallower than that of IRAS galaxies, or

• the expected angular scale $\phi_0$ is smaller than obtained by Bartelmann, as might be indicated
by the correlation function plotted in Fig. 8, or

• for some reason the correlation is not described correctly by assuming a gravitational lensing
effect by the large-scale structure. However, in view of the results by Fort et al. (1995) quoted
in the introduction, we consider this latter possibility unlikely.

Figure 9: Fraction of simulated galaxy fields yielding an error level equal to or lower than some
limit $\epsilon$ for a galaxy distribution according to the theoretically expected quasar-galaxy correlation
analysed with the weighted-average test (solid line) or Spearman's rank-order test (dashed line)
and for a purely random galaxy distribution analysed with either one of the tests (dotted line).
The number of galaxies per simulated field is 100 for the left panel and 3000 for the right panel.



5 Summary and conclusions 

In this paper, we introduced a new statistical method, the weighted-average correlation test, to 
look for angular correlations between background quasars and foreground galaxies. As was pointed 
out by Bartelmann & Schneider, such correlations are predicted by gravitational lens theory under 
plausible assumptions concerning the quasar luminosity function and the dependence between the 
distributions of dark and luminous matter. 

Whereas the methods applied in earlier investigations of this phenomenon need to group the
galaxies according to their positions into some arbitrarily defined bins, the weighted average uses
the exact distance of each individual galaxy to the corresponding quasar. Furthermore, if
additional information about the assumed correlation is available it can be included in our test by
choosing a proper weight function. In an analytic calculation we derived a formula which
allows one to construct an optimum weight function from the expected quasar-galaxy correlation
function. The verification of this result by means of numerical simulations has also shown that the
weighted-average correlation test can be more significant than Spearman's rank-order test, even
if the weight function deviates considerably from its optimum.

To look for a quasar-galaxy correlation we analysed the distribution of IRAS galaxies on circular
fields around 1Jy quasars. This was carried out for different quasar subsamples by merging the
individual galaxy fields and subjecting the resulting total field to the weighted-average test. As we
have explained and illustrated, the tests should be performed on galaxy fields as large as possible.
Actually, the outer regions carry useful information about the mean galaxy number density, which
is obviously needed to detect a possible galaxy overdensity in the center of the fields. Supplied
with an appropriate weight function the weighted average can make use of this information.

With small galaxy fields of radius $\phi_{\rm out} \approx 0.5$ deg as well as with large ones of radius $\phi_{\rm out} = 2$
deg many of the quasar subsamples give rise to a low error level for the rejection of a purely
random galaxy distribution. This is in agreement with previous findings in BS. As argued there,
a correlation of galaxies with low-redshift quasars might be due to a spatial association, whereas
a correlation with high-redshift quasars could be caused by gravitational lensing. Nevertheless
there seems to be an anticorrelation between galaxies and quasars of redshift $1.0 < z < 1.25$, and
its interpretation in terms of these hypotheses is not obvious. However, it should be pointed out
that this anticorrelation occurs with an error level larger than 10% and is therefore not of high
significance. To visualize the angular galaxy distribution within different subsamples we compiled
plots of the quasar-galaxy correlation function. Although they look quite different, this is, of
course, not a statistically significant indication that they do not represent realisations of the same
parent distribution.

From additional simulations we expected to get further hints for the interpretation of our
results. We numerically generated a large number of synthetic galaxy fields with galaxies distributed
according to the correlation function expected from gravitational lensing by the large-scale
structure as studied in Bartelmann (1995). On the basis of these results, we found that the
lensing-induced correlations between 1Jy quasars and IRAS galaxies should be detectable neither with
the weighted-average test nor with Spearman's rank-order test, because of the sparseness of the
samples.

Therefore, as stated above, we conclude that either 

• the low error levels are the result of a statistical fluctuation and do not correspond to a real
correlation, or

• the numerical results in Bartelmann (1995) significantly underestimate the expected
correlations between QSOs and IRAS galaxies, either because the non-linear density
inhomogeneities are not properly resolved in these simulations, or because the assumed redshift
distribution is much shallower than that of IRAS galaxies, or

• the expected angular scale $\phi_0$ is smaller than obtained by Bartelmann, as might be indicated
by the correlation function plotted in Fig. 8, or

• for some reason the correlation is not described correctly by assuming a gravitational lensing
effect by the large-scale structure. However, in view of the results by Fort et al. (1995) quoted
in the introduction, we consider this latter possibility unlikely.
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Appendix A. Optimizing the weight function 

In Sect. 2.1 it was argued that the correlation coefficient $r$ can be optimized for
distinguishing between two statistical hypotheses by maximizing $|Q_1|$. Here we
intend to carry out the maximisation for

$$ r_g(\phi_1, \ldots, \phi_N) := \frac{1}{N} \sum_{i=1}^{N} g(\phi_i) , \qquad (17) $$

which was shown in Sect. 2.2 to be a limiting case of $r$ applied to the analysis of quasar-galaxy
correlations. Accordingly, the symbols $\phi_1, \ldots, \phi_N$ denote galaxy positions.

In addition, it will be demonstrated that the weight function $g(\phi)$ we find in this way also
represents nearly a stationary point of the mean error level, provided the two hypotheses are
"similar".

We assume the galaxy positions to be independent of one another, so their distribution can
be characterised by a one-dimensional radial probability density $p(\phi)$. Let $p_0(\phi)$ describe the
radial galaxy distribution of the null-hypothesis, i.e. without any quasar-galaxy association.
Furthermore, suppose we have some hints, e.g. from theory, that the real distribution of galaxies
might be represented by the different probability density $p_a(\phi)$. Following Sect. 2.2 we introduce
a geometrical factor $G(\phi) \ge 0$ and write

$$ p_0(\phi) = c_0\, G(\phi) , \qquad \int_{\phi_{\rm in}}^{\phi_{\rm out}} c_0\, G(\phi)\, d\phi = 1 , $$

$$ p_a(\phi) = c_a(\phi)\, G(\phi) , \qquad \int_{\phi_{\rm in}}^{\phi_{\rm out}} c_a(\phi)\, G(\phi)\, d\phi = 1 . $$

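A.1 The quantity $Q_1$

Using the above expressions we have

$$ \langle r_g \rangle_0 = \int_{\phi_{\rm in}}^{\phi_{\rm out}} \!\cdots\! \int_{\phi_{\rm in}}^{\phi_{\rm out}} r_g(\phi_1, \ldots, \phi_N) \prod_{j=1}^{N} \left[ c_0\, G(\phi_j)\, d\phi_j \right] = c_0 \int_{\phi_{\rm in}}^{\phi_{\rm out}} g(\phi)\, G(\phi)\, d\phi , $$

$$ \langle r_g \rangle_a = \int_{\phi_{\rm in}}^{\phi_{\rm out}} g(\phi)\, c_a(\phi)\, G(\phi)\, d\phi , $$

$$ \left\langle \left( r_g - \langle r_g \rangle_0 \right)^2 \right\rangle_0 = \int_{\phi_{\rm in}}^{\phi_{\rm out}} \!\cdots\! \int_{\phi_{\rm in}}^{\phi_{\rm out}} \left[ r_g(\phi_1, \ldots, \phi_N) - \langle r_g \rangle_0 \right]^2 \prod_{j=1}^{N} \left[ c_0\, G(\phi_j)\, d\phi_j \right] $$
$$ \quad = \frac{1}{N^2} \sum_{i=1}^{N} \int [g(\phi_i)]^2\, c_0\, G(\phi_i)\, d\phi_i + \frac{1}{N^2} \sum_{i \ne j} \int\!\!\int g(\phi_i)\, g(\phi_j)\, c_0^2\, G(\phi_i)\, G(\phi_j)\, d\phi_i\, d\phi_j - \langle r_g \rangle_0^2 $$
$$ \quad = \frac{1}{N} \left[ \int [g(\phi)]^2\, c_0\, G(\phi)\, d\phi - \langle r_g \rangle_0^2 \right] . $$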
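The last expression, i.e. a variance of $r_g$ under the null-hypothesis equal to $(W - E^2)/N$ with $E = \int g\, c_0\, G\, d\phi$ and $W = \int g^2\, c_0\, G\, d\phi$, can be checked numerically. The sketch below is an illustration under explicit assumptions: the geometrical factor is taken as $G(\phi) = \phi$ on $[0, 1]$ (so $c_0 = 2$), the weight function is a hypothetical exponential, and the function names are not from the paper.

```python
import math
import random

PHI_OUT = 1.0


def g(phi):
    """Hypothetical weight function, chosen only for illustration."""
    return math.exp(-phi / 0.2)


def moments_under_null(n_steps=20000):
    """E and W for p0(phi) = c0 * G(phi) with G(phi) = phi, i.e. c0 = 2 / PHI_OUT**2."""
    c0 = 2.0 / PHI_OUT ** 2
    d_phi = PHI_OUT / n_steps
    e_val = 0.0
    w_val = 0.0
    for k in range(n_steps):
        phi = (k + 0.5) * d_phi   # midpoint rule
        e_val += g(phi) * c0 * phi * d_phi
        w_val += g(phi) ** 2 * c0 * phi * d_phi
    return e_val, w_val


def sample_r_g(n_gal, rng):
    """One realisation of r_g = (1/N) * sum_i g(phi_i) for a random field."""
    return sum(g(PHI_OUT * math.sqrt(rng.random())) for _ in range(n_gal)) / n_gal


def empirical_variance(n_gal, n_fields=4000, seed=0):
    """Sample variance of r_g over many random fields."""
    rng = random.Random(seed)
    values = [sample_r_g(n_gal, rng) for _ in range(n_fields)]
    mean = sum(values) / n_fields
    return sum((v - mean) ** 2 for v in values) / (n_fields - 1)


E, W = moments_under_null()
predicted = (W - E ** 2) / 20
measured = empirical_variance(n_gal=20)
print(predicted, measured)
```

The two printed numbers should agree to within the Monte Carlo scatter, which is a few percent for the chosen number of fields.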



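The substitution of $r$ by $r_g$ in the definition of $Q_1$ then yields

$$ Q_1 = \frac{\int g(\phi)\, c_a(\phi)\, G(\phi)\, d\phi - \int g(\phi)\, c_0\, G(\phi)\, d\phi}{\sqrt{\frac{1}{N} \left\{ \int [g(\phi)]^2\, c_0\, G(\phi)\, d\phi - \left[ \int g(\phi)\, c_0\, G(\phi)\, d\phi \right]^2 \right\}}} $$
$$ \quad = \int \sqrt{N}\, c_a(\phi')\, G(\phi')\, \frac{g(\phi') - \int g(\phi)\, c_0\, G(\phi)\, d\phi}{\sqrt{\int [g(\phi)]^2\, c_0\, G(\phi)\, d\phi - \left[ \int g(\phi)\, c_0\, G(\phi)\, d\phi \right]^2}}\, d\phi' $$
$$ \quad = \int L(\phi, E, W; g(\phi))\, d\phi , \qquad (18) $$

where all the integrations are to be performed over the interval $[\phi_{\rm in}, \phi_{\rm out}]$ and we have introduced
the abbreviations

$$ L(\phi, E, W; g(\phi)) := \sqrt{N}\, c_a(\phi)\, G(\phi)\, \frac{g(\phi) - E}{\sqrt{W - E^2}} , $$

$$ E := \int g(\phi)\, c_0\, G(\phi)\, d\phi , $$

$$ W := \int [g(\phi)]^2\, c_0\, G(\phi)\, d\phi . $$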

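In order to maximize $|Q_1|$ we investigate the influence of a variation $\epsilon \cdot h(\phi)$ of $g(\phi)$. For that
purpose, we substitute $g(\phi) \to \gamma(\phi, \epsilon) := g(\phi) + \epsilon \cdot h(\phi)$ and find

$$ Q_1(\epsilon) = \int_{\phi_{\rm in}}^{\phi_{\rm out}} L(\phi, E(\epsilon), W(\epsilon); \gamma(\phi, \epsilon))\, d\phi $$

with

$$ E(\epsilon) := \int \gamma(\phi, \epsilon)\, c_0\, G(\phi)\, d\phi , \qquad W(\epsilon) := \int [\gamma(\phi, \epsilon)]^2\, c_0\, G(\phi)\, d\phi . $$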


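The condition for the weight function $g(\phi)$ to produce an extreme value of $Q_1$ can now be expressed
as

$$ \int_{\phi_{\rm in}}^{\phi_{\rm out}} \left. \frac{d}{d\epsilon} L(\phi, E(\epsilon), W(\epsilon); \gamma(\phi, \epsilon)) \right|_{\epsilon=0} d\phi = 0 , \qquad (19) $$

which is equivalent to

$$ \int \left. \frac{\partial L}{\partial \gamma}(\phi) \right|_{\epsilon=0} h(\phi)\, d\phi + \int \left. \frac{\partial L}{\partial E}(\phi) \right|_{\epsilon=0} \int h(t)\, c_0\, G(t)\, dt\, d\phi + \int \left. \frac{\partial L}{\partial W}(\phi) \right|_{\epsilon=0} \int 2 g(t)\, h(t)\, c_0\, G(t)\, dt\, d\phi = 0 \qquad (20) $$

because of the relations

$$ \frac{dL}{d\epsilon} = \frac{\partial L}{\partial \gamma} \frac{d\gamma(\phi, \epsilon)}{d\epsilon} + \frac{\partial L}{\partial E} \frac{dE(\epsilon)}{d\epsilon} + \frac{\partial L}{\partial W} \frac{dW(\epsilon)}{d\epsilon} $$

and

$$ \left. \frac{d\gamma(\phi, \epsilon)}{d\epsilon} \right|_{\epsilon=0} = h(\phi) , \qquad \left. \frac{dE(\epsilon)}{d\epsilon} \right|_{\epsilon=0} = \int h(\phi)\, c_0\, G(\phi)\, d\phi , \qquad \left. \frac{dW(\epsilon)}{d\epsilon} \right|_{\epsilon=0} = 2 \int g(\phi)\, h(\phi)\, c_0\, G(\phi)\, d\phi . $$

Exchanging both the order of integrations and the variables $\phi$ and $t$ in the second and third term
of Eq. (20) leads to

$$ \int \left. \frac{\partial L}{\partial \gamma}(\phi) \right|_{\epsilon=0} h(\phi)\, d\phi + \int\!\!\int \left. \frac{\partial L}{\partial E}(t) \right|_{\epsilon=0} h(\phi)\, c_0\, G(\phi)\, dt\, d\phi + \int\!\!\int \left. \frac{\partial L}{\partial W}(t) \right|_{\epsilon=0} 2 g(\phi)\, h(\phi)\, c_0\, G(\phi)\, dt\, d\phi = \int q(\phi)\, h(\phi)\, d\phi = 0 , \qquad (21) $$

where $q(\phi)$ is defined to be

$$ q(\phi) := \left. \frac{\partial L}{\partial \gamma}(\phi) \right|_{\epsilon=0} + c_0\, G(\phi) \int \left. \frac{\partial L}{\partial E}(t) \right|_{\epsilon=0} dt + 2 g(\phi)\, c_0\, G(\phi) \int \left. \frac{\partial L}{\partial W}(t) \right|_{\epsilon=0} dt . $$

As Eq. (21) must hold for an arbitrary choice of $h(\phi)$, it is finally reduced to

$$ q(\phi) = 0 \quad \mbox{on} \quad [\phi_{\rm in}, \phi_{\rm out}] . $$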
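The partial derivatives of $L$ that enter $q$ are

$$ \left. \frac{\partial L}{\partial \gamma}(\phi) \right|_{\epsilon=0} = \sqrt{N}\, c_a(\phi)\, G(\phi)\, \frac{1}{\sqrt{W - E^2}} , $$

$$ \left. \frac{\partial L}{\partial E}(\phi) \right|_{\epsilon=0} = \sqrt{N}\, c_a(\phi)\, G(\phi) \left[ E\, \frac{g(\phi) - E}{(W - E^2)^{3/2}} - \frac{1}{\sqrt{W - E^2}} \right] , $$

$$ \left. \frac{\partial L}{\partial W}(\phi) \right|_{\epsilon=0} = - \frac{1}{2} \sqrt{N}\, c_a(\phi)\, G(\phi)\, \frac{g(\phi) - E}{(W - E^2)^{3/2}} , $$

so after dividing by $\sqrt{N / (W - E^2)}$ we have

$$ c_a(\phi)\, G(\phi) + c_0\, G(\phi)\, E\, \frac{\int g(t)\, c_a(t)\, G(t)\, dt - E}{W - E^2} - c_0\, G(\phi) - c_0\, G(\phi)\, g(\phi)\, \frac{\int g(t)\, c_a(t)\, G(t)\, dt - E}{W - E^2} = 0 . $$

We first note that, if $G(\phi') = 0$ at a position $\phi'$, this condition is fulfilled for an arbitrary finite
value of $g(\phi')$, whereas for $G \ne 0$ a division by $c_0\, G(\phi)$ yields

$$ \frac{c_a(\phi)}{c_0} + E\, \frac{\int g(t)\, c_a(t)\, G(t)\, dt - E}{W - E^2} - g(\phi)\, \frac{\int g(t)\, c_a(t)\, G(t)\, dt - E}{W - E^2} = 1 . \qquad (22) $$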



Furthermore, it is easy to see that relation (22) is invariant under linear transformations

$$ g(\phi) \to \hat{g}(\phi) := \alpha\, g(\phi) + \beta , \qquad \alpha \ne 0 , $$

because

$$ \hat{E} := \int \hat{g}(\phi)\, c_0\, G(\phi)\, d\phi = \alpha E + \beta , $$

$$ \hat{W} := \int [\hat{g}(\phi)]^2\, c_0\, G(\phi)\, d\phi = \alpha^2 W + 2 \alpha \beta E + \beta^2 $$

and

$$ \hat{E}^2 = \alpha^2 E^2 + 2 \alpha \beta E + \beta^2 , $$

$$ \hat{W} - \hat{E}^2 = \alpha^2 \left( W - E^2 \right) , $$

$$ \int \hat{g}(t)\, c_a(t)\, G(t)\, dt - \hat{E} = \alpha \left[ \int g(t)\, c_a(t)\, G(t)\, dt - E \right] . $$

As a consequence, we can choose $g(\phi)$ such that

$$ \frac{\int g(t)\, c_a(t)\, G(t)\, dt - E}{W - E^2} = 1 \quad \wedge \quad E = 1 , $$

thereby simplifying (22) to

$$ g(\phi) = c_a(\phi) / c_0 . $$

From this we derive the stationary points of $Q_1$ with respect to the weight function $g(\phi)$ to be
given by

$$ g(\phi) = a\, c_a(\phi) + b , \qquad (23) $$

with arbitrary constants $a \ne 0$ and $b$.

The next step of our calculation is to show that the stationary point given by relation (23)
specifies a local maximum of $|Q_1|$. We consider an arbitrary deviation $\epsilon \cdot h(\phi)$ of $g(\phi)$ from (23),

$$ \gamma(\phi, \epsilon) = a \cdot \left[ c_a(\phi) + \epsilon \cdot h(\phi) \right] + b , \qquad (24) $$

and analyse its effect on $Q_1$:

$$ Q_1(\epsilon) = \frac{\int \gamma(\phi, \epsilon)\, c_a(\phi)\, G(\phi)\, d\phi - \int \gamma(\phi, \epsilon)\, c_0\, G(\phi)\, d\phi}{\sqrt{\frac{1}{N} \int \left[ \gamma(\phi, \epsilon) - \int \gamma(t, \epsilon)\, c_0\, G(t)\, dt \right]^2 c_0\, G(\phi)\, d\phi}} . $$

The denominator on the right hand side of this expression is equal to zero at some point $\epsilon = \epsilon'$
only if

$$ \gamma(\phi, \epsilon') = \int \gamma(t, \epsilon')\, c_0\, G(t)\, dt , $$

which is equivalent to $\gamma(\phi, \epsilon')$ being a constant in $\phi$. That, in turn, requires $h(\phi) = -(1/\epsilon')\, c_a(\phi) +$
const, resulting in

$$ \gamma(\phi, \epsilon) = \alpha(\epsilon)\, c_a(\phi) + \beta(\epsilon) , \qquad (25) $$

where $\alpha(\epsilon)$ and $\beta(\epsilon)$ are constants in $\phi$. Equation (25), however, implies $|Q_1(\epsilon)| =$ const, as can
easily be derived from the definition of $Q_1(\epsilon)$. In the subsequent investigation we will therefore,
without loss of generality, exclude the case of a vanishing denominator of $Q_1(\epsilon)$.
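Introducing new abbreviations

$$ X := \int \left[ h(\phi) - \int h(t)\, c_0\, G(t)\, dt \right]^2 G(\phi)\, d\phi = \int [h(\phi)]^2\, G(\phi)\, d\phi - c_0 \left[ \int h(t)\, G(t)\, dt \right]^2 \ge 0 , $$

$$ Y := \int h(\phi)\, c_a(\phi)\, G(\phi)\, d\phi - \int h(\phi)\, c_0\, G(\phi)\, d\phi = \int \left[ c_a(\phi) - c_0 \right] G(\phi) \left[ h(\phi) - \int h(t)\, c_0\, G(t)\, dt \right] d\phi , $$

$$ Z := \int [c_a(\phi)]^2\, G(\phi)\, d\phi - c_0 = \int \left[ c_a(\phi) - c_0 \right]^2 G(\phi)\, d\phi \ge 0 $$

and

$$ q_1(\epsilon) := \sqrt{N}\, \frac{Z + \epsilon Y}{\sqrt{c_0 Z + 2 \epsilon c_0 Y + \epsilon^2 c_0 X}} \qquad (26) $$

we write $Q_1(\epsilon)$ in the form

$$ Q_1(\epsilon) = \frac{a}{|a|} \cdot q_1(\epsilon) . $$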

The first derivative of $q_1(\epsilon)$ in $\epsilon$ is

$$ \frac{d}{d\epsilon} q_1(\epsilon) = \sqrt{N}\, \epsilon\, \frac{c_0 Y^2 - c_0 X Z}{\left( c_0 Z + 2 \epsilon c_0 Y + \epsilon^2 c_0 X \right)^{3/2}} , $$

where, because of the Schwarz inequality^1, the expression $c_0 Y^2 - c_0 X Z$ in the numerator on the right hand side is always lower
than or equal to zero. This means $q_1(\epsilon)$ is monotonically increasing for $\epsilon < 0$ but monotonically
decreasing for $\epsilon > 0$. Accordingly, $q_1(\epsilon)$ takes its global maximum at $\epsilon = 0$,
which is, as $q_1(\epsilon = 0) \ge 0$ and $|Q_1(\epsilon)| = |q_1(\epsilon)|$, at least a local maximum of $|Q_1(\epsilon)|$.

^1 Given three functions $u, v : {\rm I\!R} \to {\rm I\!R}$, $G : {\rm I\!R} \to {\rm I\!R}_0^+$ the Schwarz inequality states

$$ \left[ \int u(x)\, v(x)\, G(x)\, dx \right]^2 \le \int [u(x)]^2\, G(x)\, dx \int [v(x)]^2\, G(x)\, dx . $$
At this point we know condition (23) to correspond to a local maximum of $|Q_1|$. The final goal
of the following arguments is to demonstrate that this local maximum is also the global maximum.
Taking into account that $|Q_1(\epsilon)| = |q_1(\epsilon)|$ and that the global maximum of $q_1(\epsilon)$ is located at
$\epsilon = 0$ with $q_1(0) \ge 0$ it is sufficient to prove $|q_1^*| \le q_1(0)$ for the absolute minimum $q_1^*$ of $q_1(\epsilon)$.
Because of the monotonicity of $q_1(\epsilon)$ we obtain

$$ q_1^* = \min \left\{ \lim_{\epsilon \to -\infty} q_1(\epsilon) , \lim_{\epsilon \to \infty} q_1(\epsilon) \right\} $$

with

$$ \lim_{\epsilon \to -\infty} q_1(\epsilon) = - \lim_{\epsilon \to \infty} q_1(\epsilon) = - \sqrt{N}\, \frac{Y}{\sqrt{c_0 X}} , $$

as can be seen from definition (26). Depending on the sign of $Y$ one of the limits is greater than or
equal to zero but lower than or equal to $q_1(0)$, because $q_1(0)$ is the absolute maximum of $q_1(\epsilon)$.
Therefore, it is

$$ q_1(0) \ge \left| \lim_{\epsilon \to -\infty} q_1(\epsilon) \right| = \left| \lim_{\epsilon \to \infty} q_1(\epsilon) \right| = |q_1^*| , $$

which means $|Q_1|$ is globally maximized if the weight function $g(\phi)$ of the correlation coefficient
(17) is in agreement with Eq. (22).
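The result of this section can be checked numerically on a grid. The sketch below is only an illustration under stated assumptions: the geometrical factor is taken as $G(\phi) = \phi$ on $[0, 1]$ (so $c_0 = 2$), the alternative hypothesis is a toy choice $c_a(\phi) = c_0 (1 + \eta\, v(\phi))$ with an exponential $v$, and all names and numerical values are hypothetical. It evaluates $Q_1 / \sqrt{N} = (\int g\, c_a\, G\, d\phi - E) / \sqrt{W - E^2}$ for several weight functions.

```python
import math

N_STEPS = 4000
PHI_OUT = 1.0
D_PHI = PHI_OUT / N_STEPS
GRID = [(k + 0.5) * D_PHI for k in range(N_STEPS)]   # midpoints of the grid cells
C0 = 2.0 / PHI_OUT ** 2                               # normalises p0 = C0 * G for G(phi) = phi
ETA = 0.3                                             # strength of the toy deviation


def integrate(f):
    """Midpoint-rule integral of f(phi) * G(phi) over [0, PHI_OUT], with G(phi) = phi."""
    return sum(f(phi) * phi for phi in GRID) * D_PHI


def bump(phi):
    """Toy radial profile of the excess; an assumption of this sketch."""
    return math.exp(-phi / 0.2)


MEAN_BUMP = C0 * integrate(bump)


def c_a(phi):
    """Toy alternative: c_a = C0 * (1 + ETA * v) with v = bump - mean, so p_a stays normalised."""
    return C0 * (1.0 + ETA * (bump(phi) - MEAN_BUMP))


def q1_over_sqrt_n(g):
    """Q_1 / sqrt(N) = (int g c_a G dphi - E) / sqrt(W - E^2)."""
    e_val = C0 * integrate(g)
    w_val = C0 * integrate(lambda phi: g(phi) ** 2)
    signal = integrate(lambda phi: g(phi) * c_a(phi))
    return (signal - e_val) / math.sqrt(w_val - e_val ** 2)


best = q1_over_sqrt_n(c_a)
shifted = q1_over_sqrt_n(lambda phi: 3.0 * c_a(phi) - 7.0)
perturbed = q1_over_sqrt_n(lambda phi: c_a(phi) + 0.5 * math.cos(4.0 * phi))
flipped = q1_over_sqrt_n(lambda phi: -c_a(phi))
print(best, shifted, perturbed, flipped)
```

In this toy setup a linear transformation with $a > 0$ leaves the value unchanged, a negative $a$ only flips the sign, and a weight function that is not of the form $a\, c_a(\phi) + b$ gives a smaller value, in line with the derivation above.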



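A.2 The mean error level

In analogy to the mean error level of $r$, one can define the mean error level $\langle P_0(r_g \ge R) \rangle$ of the correlation coefficient
$r_g$. A good distinction between the two galaxy distributions described by $p_0(\phi)$ and $p_a(\phi)$ is
possible if the mean error level is low. Therefore, we would like the weight function $g(\phi)$ to
minimize $\langle P_0(r_g \ge R) \rangle$. For this it is a necessary condition that the mean error level is stationary
with respect to small variations $\epsilon \cdot h(\phi)$ of $g(\phi)$:

$$ \left. \frac{d}{d\epsilon} \langle P_0(r_\gamma \ge R) \rangle \right|_{\epsilon=0} = 0 , \qquad (27) $$

with $\gamma(\phi, \epsilon) := g(\phi) + \epsilon\, h(\phi)$. If we write $p_a(\phi)$ in the form

$$ p_a(\phi) := p_0(\phi) \left( 1 + \eta\, v(\phi) \right) $$

and fix $v(\phi)$, then in general $g(\phi)$ will depend on $\eta$. But as (27) must hold for any value of $\eta$, we
have

$$ \left. \frac{d}{d\eta} \frac{d}{d\epsilon} \langle P_0(r_\gamma \ge R) \rangle \right|_{\epsilon=0} = 0 . \qquad (28) $$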

Proof: With the definitions

$$ A := \int [u(x)]^2\, G(x)\, dx \ge 0 , \qquad B := \int u(x)\, v(x)\, G(x)\, dx , \qquad C := \int [v(x)]^2\, G(x)\, dx \ge 0 , $$

and an arbitrary $\alpha \in {\rm I\!R}$ we have

$$ 0 \le C \int [u(x) + \alpha\, v(x)]^2\, G(x)\, dx = (B + \alpha C)^2 + A C - B^2 , $$

so that by choosing $\alpha = -B/C$ we obtain $A \cdot C - B^2 \ge 0$ or equivalently $B^2 \le A \cdot C$ for $C \ne 0$. For the remaining
case of $C = 0$ we can write

$$ 0 \le \int [u(x) + \alpha\, v(x)]^2\, G(x)\, dx = A + 2 \alpha B , $$

which must hold for any value of $\alpha$. From that we derive $B = 0$, which is consistent with $B^2 \le A \cdot C = 0$, q.e.d.

What we want to prove here is that for $\eta \to 0$ both Eq. (27) and Eq. (28) hold if $g(\phi)$ is in
agreement with Eq. (23). For convenience, let us define the symbols

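$$ P_{0g}(R) := 1 - P_0(r_g \ge R) = P_0(r_g < R) = \int \!\cdots\! \int p_0(\phi_1) \cdots p_0(\phi_N)\, \Theta\!\left( R - \frac{1}{N} \sum_{i=1}^{N} g(\phi_i) \right) d^N\phi , $$

$$ P_{ag}(R) := P_a(r_g < R) = \int \!\cdots\! \int p_a(\phi_1) \cdots p_a(\phi_N)\, \Theta\!\left( R - \frac{1}{N} \sum_{i=1}^{N} g(\phi_i) \right) d^N\phi , $$

$$ p_{0g}(r) := \frac{d}{dr} P_{0g}(r) = \int \!\cdots\! \int p_0(\phi_1) \cdots p_0(\phi_N)\, \delta\!\left( r - \frac{1}{N} \sum_{i=1}^{N} g(\phi_i) \right) d^N\phi , $$

$$ p_{ag}(r) := \frac{d}{dr} P_{ag}(r) = \int \!\cdots\! \int p_a(\phi_1) \cdots p_a(\phi_N)\, \delta\!\left( r - \frac{1}{N} \sum_{i=1}^{N} g(\phi_i) \right) d^N\phi , $$

with Dirac's delta function $\delta(r)$ and Heaviside's step function $\Theta(r)$. All the integrations have to
be performed over $[\phi_{\rm in}, \phi_{\rm out}]$. A partial integration of the mean error level yields

$$ \langle P_0(r_\gamma \ge R) \rangle = \int_{-\infty}^{\infty} \left[ 1 - P_{0\gamma}(r) \right] p_{a\gamma}(r)\, dr = \int_{-\infty}^{\infty} P_{a\gamma}(r)\, p_{0\gamma}(r)\, dr . $$

It is evident that for $\eta \to 0$ the mean error level reaches the constant value
$\int P_{0g}(r)\, p_{0g}(r)\, dr = 1/2$ for an arbitrary weight function $g$, so obviously Eq. (27) is fulfilled.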
Furthermore, we find 



d__d 

drj de 



— —{Poi {r^>r)) 



e = 
J7 = 



d_ 

drj 



d 



d 



P07 [r) -Q^P^-r (r) + ^a7 (r) g^Poj (r) 



dr 



£=0 

T7=0 



a7 (r) - Pa7 (^) ^^07 (^) 



= 1^ [Po. (r) |Pa. 

7/ — U 

1 /• /• ( 1 ^ 1 ^ \ 

JV JV Iff 

X ^ /'^ (<^fe) ^ (0z) d^0d^0' -/.../ p„ .. (0^) p„ (0;) . . (0^) 
fc=i i=i 

( Af 1 \ ^ ^ 

N E 3 (<^^) -nT.9 i^',) E ^ ('^'^■) E ^ (<^?) d^0d^0' = 
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= '?/••• / Po {(t)l) ■ ■ ■ Po {(^n) Po((/>i)---Po((/>W) X 



x<5 



N \ ^ ^ 



d^(^d^0' 



i=i 



fc=l 



Because of the delta function only those points $(\phi_1, \ldots, \phi_N, \phi_1', \ldots, \phi_N')$ contribute to the
integral which meet the condition

$$ \sum_{i=1}^{N} \left[ g(\phi_i) - g(\phi_i') \right] = 0 . \qquad (29) $$


Now suppose $g(\phi)$ to be in accordance with expression (23). Then (29) implies

$$ \sum_{k=1}^{N} \left[ v(\phi_k) - v(\phi_k') \right] = 0 $$

and it is

$$ \left. \frac{d}{d\eta} \frac{d}{d\epsilon} \langle P_0(r_\gamma \ge R) \rangle \right|_{\epsilon=0,\, \eta=0} = 0 . $$

This final result shows that the weight function specified by relation (23) not only maximizes
the quantity $|Q_1|$ as discussed in the preceding section, but is also close to a stationary point of
the mean error level, if $\eta \approx 0$, i.e. $p_a(\phi) \approx p_0(\phi)$.
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